78 research outputs found

    Mapping the Gene Ontology Into the Unified Medical Language System

    We have recently mapped the Gene Ontology (GO), developed by the Gene Ontology Consortium, into the National Library of Medicine's Unified Medical Language System (UMLS). GO has been developed for the purpose of annotating gene products in genome databases, and the UMLS has been developed as a framework for integrating large numbers of disparate terminologies, primarily for the purpose of providing better access to biomedical information sources. The mapping of GO to UMLS highlighted issues in both terminology systems. After some initial explorations and discussions between the UMLS and GO teams, the GO was integrated with the UMLS. Overall, a total of 23% of the GO terms either matched directly (3%) or linked (20%) to existing UMLS concepts. All GO terms now have a corresponding, official UMLS concept, and the entire vocabulary is available through the web-based UMLS Knowledge Source Server. The mapping of the Gene Ontology, with its focus on structures, processes and functions at the molecular level, to the existing broad-coverage UMLS should contribute to linking the language and practices of clinical medicine to the language and practices of genomics.

    An Upper-Level Ontology for the Biomedical Domain

    At the US National Library of Medicine we have developed the Unified Medical Language System (UMLS), whose goal is to provide integrated access to a large number of biomedical resources by unifying the vocabularies that are used to access those resources. The UMLS currently interrelates some 60 controlled vocabularies in the biomedical domain. The UMLS coverage is quite extensive, including not only many concepts in clinical medicine, but also a large number of concepts applicable to the broad domain of the life sciences. In order to provide an overarching conceptual framework for all UMLS concepts, we developed an upper-level ontology, called the UMLS semantic network. The semantic network, through its 134 semantic types, provides a consistent categorization of all concepts represented in the UMLS. The 54 links between the semantic types provide the structure for the network and represent important relationships in the biomedical domain. Because of the growing number of information resources that contain genetic information, the UMLS coverage in this area is being expanded. We recently integrated the taxonomy of organisms developed by the NLM's National Center for Biotechnology Information, and we are currently working together with the developers of the Gene Ontology to integrate this resource as well. As additional standard ontologies become publicly available, we expect to integrate these into the UMLS construct.

    R|S Atlas: Identifying Existing Cohort Study Data Resources to Accelerate Epidemiological Research on the Influence of Religion and Spirituality on Human Health

    OBJECTIVE: Many studies have documented significant associations between religion and spirituality (R/S) and health, but relatively few prospective analyses exist that can support causal inferences. To date, there has been no systematic analysis of R/S survey items collected in US cohort studies. We conducted a systematic content analysis of all surveys ever fielded in 20 diverse US cohort studies funded by the National Institutes of Health (NIH) to identify all R/S-related items collected from each cohort's baseline survey through 2014. DESIGN: An R|S Ontology was developed from our systematic content analysis to categorise all R/S survey items identified into key conceptual categories. A systematic literature review was completed for each R/S item to identify any cohort publications involving these items through 2018. RESULTS: Our content analysis identified 319 R/S survey items, reflecting 213 unique R/S constructs and 50 R|S Ontology categories. 193 of the 319 extant R/S survey items had been analysed in at least one published paper. Using these data, we created the R|S Atlas (https://atlas.mgh.harvard.edu/), a publicly available, online relational database that allows investigators to identify R/S survey items that have been collected by US cohorts, and to further refine searches by other key data available in cohorts that may be necessary for a given study (eg, race/ethnicity, availability of DNA or geocoded data). CONCLUSIONS: R|S Atlas not only allows researchers to identify available sources of R/S data in cohort studies but will also assist in identifying novel research questions that have yet to be explored within the context of US cohort studies.

    Markov Chain Ontology Analysis (MCOA)

    Background: Biomedical ontologies have become an increasingly critical lens through which researchers analyze the genomic, clinical and bibliographic data that fuel scientific research. Of particular relevance are methods, such as enrichment analysis, that quantify the importance of ontology classes relative to a collection of domain data. Current analytical techniques, however, remain limited in their ability to handle many important types of structural complexity encountered in real biological systems including class overlaps, continuously valued data, inter-instance relationships, non-hierarchical relationships between classes, semantic distance and sparse data. Results: In this paper, we describe a methodology called Markov Chain Ontology Analysis (MCOA) and illustrate its use through an MCOA-based enrichment analysis application based on a generative model of gene activation. MCOA models the classes in an ontology, the instances from an associated dataset and all directional inter-class, class-to-instance and inter-instance relationships as a single finite ergodic Markov chain. The adjusted transition probability matrix for this Markov chain enables the calculation of eigenvector values that quantify the importance of each ontology class relative to other classes and the associated data set members. On both controlled Gene Ontology (GO) data sets created with Escherichia coli, Drosophila melanogaster and Homo sapiens annotations and real gene expression data extracted from the Gene Expression Omnibus (GEO), the MCOA enrichment analysis approach provides the best performance among comparable state-of-the-art methods. Conclusion: A methodology based on Markov chain models and network analytic metrics can help detect the relevant signal within large, highly interdependent and noisy data sets and, for applications such as enrichment analysis, has been shown to generate superior performance on both real and simulated data relative to existing state-of-the-art approaches.
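The idea of scoring ontology classes by the dominant eigenvector of a Markov chain can be illustrated with a small sketch. This is not the MCOA implementation: the toy graph, the node names, and the uniform-jump "adjustment" (borrowed from PageRank-style damping to make the chain ergodic) are all assumptions made for illustration, and the abstract does not specify how the authors adjust the transition matrix.

```python
def row_normalize(weights):
    """Turn a non-negative weight matrix into a row-stochastic matrix."""
    n = len(weights)
    matrix = []
    for row in weights:
        total = sum(row)
        # A node with no outgoing edges jumps uniformly to any node.
        matrix.append([w / total for w in row] if total > 0 else [1.0 / n] * n)
    return matrix


def adjust(matrix, damping=0.85):
    """Assumed adjustment: mix in uniform jumps so the chain is ergodic."""
    n = len(matrix)
    return [[damping * p + (1.0 - damping) / n for p in row] for row in matrix]


def stationary_distribution(matrix, tol=1e-12, max_iter=10_000):
    """Power iteration for the left eigenvector with eigenvalue 1."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical toy graph: two ontology classes and three gene instances.
nodes = ["class_A", "class_B", "gene_1", "gene_2", "gene_3"]
weights = [
    # A  B  g1 g2 g3
    [0, 1, 1, 1, 0],  # class_A -> class_B, gene_1, gene_2
    [0, 0, 0, 0, 1],  # class_B -> gene_3
    [1, 0, 0, 1, 0],  # gene_1 -> class_A, gene_2
    [1, 0, 0, 0, 0],  # gene_2 -> class_A
    [0, 1, 0, 0, 0],  # gene_3 -> class_B
]

transition = adjust(row_normalize(weights))
scores = dict(zip(nodes, stationary_distribution(transition)))
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.4f}")
```

Each score is the long-run fraction of time a random walker spends at that node, so classes and instances are ranked on a single scale. Without the uniform-jump term, the walker in this toy graph would eventually be trapped in the class_B/gene_3 cycle, which is why some adjustment is needed before the eigenvector is meaningful.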

    Characteristics of undiagnosed diseases network applicants: implications for referring providers

    Background: The majority of undiagnosed diseases manifest with objective findings that warrant further investigation. The Undiagnosed Diseases Network (UDN) receives applications from patients whose symptoms and signs have been intractable to diagnosis; however, many UDN applicants are affected primarily by subjective symptoms such as pain and fatigue. We sought to characterize presenting symptoms, referral sources, and demographic factors of applicants to the UDN to identify factors that may determine application outcome and potentially differentiate between those with undiagnosed diseases (with more objective findings) and those who are less likely to have an undiagnosed disease (more subjective symptoms). Methods: We used a systematic retrospective review of 151 consecutive Not Accepted and 50 randomly selected Accepted UDN applications. The primary outcome was whether an applicant was Accepted, or Not Accepted, and, if accepted, whether or not a diagnosis was made. Objective and subjective symptoms and information on prior specialty consultations were collected from provider referral letters. Demographic data and decision data on network acceptance were gathered from the UDN online portal. Results: Fewer objective findings and more subjective symptoms were found in the Not Accepted applications. Not Accepted referrals also were from older individuals, reported a shorter period of illness, and were referred to the UDN by their primary care physicians. All of these differences reached statistical significance in comparison with Accepted applications. The frequency of subspecialty consults for diagnostic purposes prior to UDN application was similar in both groups. Conclusions: The preponderance of subjective and lack of objective findings in the Not Accepted applications distinguish these from applicants that are accepted for evaluation and diagnostic efforts through the UDN. Not Accepted applicants are referred primarily by their primary care providers after multiple specialist consultations fail to yield answers. Distinguishing between patients with undiagnosed diseases with objective findings and those with primarily subjective findings can delineate patients who would benefit from further diagnostic processes from those who may have functional disorders and need alternative pathways for management of their symptoms. Trial registration: clinicaltrials.gov NCT02450851, posted May 21st 2015.

    Network Analyses Reveal Novel Aspects of ALS Pathogenesis

    Amyotrophic Lateral Sclerosis (ALS) is a fatal neurodegenerative disease characterized by selective loss of motor neurons, muscle atrophy and paralysis. Mutations in the human VAMP-associated protein B (hVAPB) cause a heterogeneous group of motor neuron diseases including ALS8. Despite extensive research, the molecular mechanisms underlying ALS pathogenesis remain largely unknown. Genetic screens for key interactors of hVAPB activity in the intact nervous system, however, represent a fundamental approach towards understanding the in vivo function of hVAPB and its role in ALS pathogenesis. Targeted expression of the disease-causing allele leads to neurodegeneration and progressive decline in motor performance when expressed in the adult Drosophila eye or in its entire nervous system, respectively. By using these two phenotypic readouts, we carried out a systematic survey of the Drosophila genome to identify modifiers of hVAPB-induced neurotoxicity. Modifiers cluster in a diverse array of biological functions including processes and genes that have been previously linked to hVAPB function, such as proteolysis and vesicular trafficking. In addition to established mechanisms, the screen identified endocytic trafficking and genes controlling proliferation and apoptosis as potent modifiers of ALS8-mediated defects. Surprisingly, the list of modifiers was mostly enriched for proteins linked to lipid droplet biogenesis and dynamics. Computational analysis reveals that most modifiers can be linked into a complex network of interacting genes, and that the human genes homologous to the Drosophila modifiers can be assembled into an interacting network largely overlapping with that in flies. Identity markers of the endocytic process were also found to abnormally accumulate in ALS patients, further supporting the relevance of the fly data for human biology. Collectively, these results not only lead to a better understanding of hVAPB function but also point to potentially relevant targets for therapeutic intervention.

    An upper-level ontology for the biomedical domain (Comparative and Functional Genomics)

    Abstract: At the US National Library of Medicine we have developed the Unified Medical Language System (UMLS), whose goal is to provide integrated access to a large number of biomedical resources by unifying the vocabularies that are used to access those resources. The UMLS currently interrelates some 60 controlled vocabularies in the biomedical domain. The UMLS coverage is quite extensive, including not only many concepts in clinical medicine, but also a large number of concepts applicable to the broad domain of the life sciences. In order to provide an overarching conceptual framework for all UMLS concepts, we developed an upper-level ontology, called the UMLS semantic network. The semantic network, through its 134 semantic types, provides a consistent categorization of all concepts represented in the UMLS. The 54 links between the semantic types provide the structure for the network and represent important relationships in the biomedical domain. Because of the growing number of information resources that contain genetic information, the UMLS coverage in this area is being expanded. We recently integrated the taxonomy of organisms developed by the NLM's National Center for Biotechnology Information, and we are currently working together with the developers of the Gene Ontology to integrate this resource as well. As additional standard ontologies become publicly available, we expect to integrate these into the UMLS construct. The development of an ontology is generally motivated by a particular problem that its developers are attempting to solve. In most cases, the problem itself will have a significant impact on the design, implementation and further development of the ontology. In addition, the developers' domain expertise, experience in knowledge representation methodology, and ability to maintain the ontology over a long period of time all determine the final outcome. While there is some disagreement about what qualifies as an ontology, most agree that an ontology is a representation of a domain of interest, which, at a minimum, involves naming the basic concepts in that domain. The purpose to which the ontology will be put determines the nature and type of ontology that is created. While a simple list of controlled terms can be sufficient for indexing documents and other datasets, even here some complexity, e.g. in the form of synonyms, is often added once the terminology is put to use. Placing the concepts in a hierarchy provides another level of complexity.